CLIENT-SERVER STANDARDS FOR TEXT: FOUNDATION FOR INNOVATION

How do I access thee? Let me count the ways... Dow Jones News Retrieval
has one interface; Lotus Magellan another; CompuServe discussion groups a
third; our wp files a fourth; Computer Library's Computer Select and
Information Access's Magazine Rack (from the same parent company, Ziff
Communications!) yet two more. Then there's IZE and Lotus Notes, Folio
Views and cc:Mail, ZyIndex and The WELL.

All this at a time when a single user interface (that is, any of many user
interfaces) offers access to a wide variety of structured data sources, and
a single data source can be addressed through many user interfaces. The
promise of SQL -- heterogeneous access to structured data -- is now being
realized, and makes the limitations of text retrieval more apparent. Over
the next decade we will need to handle a rapidly increasing volume both of
unstructured text and of text structured in clever, nonstandard ways by
people and by products such as Notes, Verity's Topic, Folio Views, and tools
for building semi-structured e-mail messages, forms and EDI applications.

This issue is about some early efforts to provide SQL-like facilities for
text -- but remember that it took a decade for SQL to catch on. Perhaps we
can do it faster the second time around, as information proliferates and we
demand maps and signposts for all the territory in our electronic frontier.

The goal is that a given text front-end can retrieve data from any back-end,
instead of the situation now where we have the confusion of front-ends
described above. As with data, you should be able to run a single query
against your own files, against structured corporate text bases and against
external sources such as Dow Jones, Reuters or Mead's Lexis.

The data world has long had SQL (Structured Query Language), a neutral
language (and an official standard) for describing databases and querying
data that works across platforms and databases. Detractors point out that
SQL is only a subset of a multitude of diverse systems that don't
interoperate. It's a description language, not a programming language, and
can't do much by itself. But of course that's also its virtue. People have
been innovating around SQL for the past decade and will continue to do so
well into the 21st century.

[...] so much more [...] items such as text [...], footnotes, paragraphs,
[...]; and maintenance of links, categorization, [...] outlines,
hierarchies, tables of contents, cross-references and structures such as
[...] and document identif[...]. In addition, text may have
display-oriented information: fonts and styles; character sets; graphics,
including vectorization of fonts and images; layout and formatting;
hyphenation and justification. [...] information provided [...] another; a
document's representation depends on recognizing text objects, with
headlines displayed one way [...]; a text-search program [...] first three
paragraphs of any document, or assign different weights to different parts
of a document; a table of contents lists subheads.

All these are related at one level or another, but to handle them all at the
same time would be foolish. [...] here have to do only with text retrieval
and content, not with display, layout, or other presentation and
document-processing functions addressed by standards such as Adobe's
PostScript. In [...] standards attempt to [...] according to a [...] client
from any server.


Serve  me  some  text 


You can specify a text in four ways -- by identity, by content, by
association [...] proximity, etc., or by criteria.

Identity is very simple, or should be. [...] text, which can be assigned a
unique ID [...] servers from inadvertently reusing [...]. [...] Quarterman's
1989 book The Matrix a [...] of his 1986 article "Notable Computer Networks"
in Communications of the ACM? What a [...]

Which is the real article [...] the one in the New York Times, or [...] that
appeared later in the San Jose Mercury? The original, or the translation? Do
you want the 1989 projections, or the disappointing 1991 actual [...]?

Document IDs are important also for copyright records and other forms of
authors' rights (cf. [...] and misquotations). They allow for authors to make
specific references to other documents, including the server(s) where they
may be [...], [...] foundation for copyright protection and [...]. Ideally,
IDs could save people repeating others' [...] just incorporate it [...]
reference. You can also [...] notate it, praise it, deride it or refute it
[...]. [...] use a referenced document as the basis of a [...] to look at
the document itself.

Content means "what it's about," and is the fuzziest but most universal
de[...] of a text; it's not unique [...]. Defining content perfectly [...]
the unachievable [...]. Content can be assessed by the presence of [...],
the presence of other words, etc. There are a variety of [...] of defining
[...] content (see Release 1.0, [...]), [...] topic hierarchies, [...] nets,
and ranging all the way [...] natural language parsing, [...].

Association is complex. It could be "all texts linked to 'bolt number
520J-Z2'." Or it could be "all articles cited in the footnotes in chapter 13"
of a particular document. Or it could simply be items classified in a
particular category, such as "life in the fast lane," rather than items
containing those keywords.

Criteria  are  what  would  be  called  values  in  a  database.     These  can  include 
sources  (publications,  publishers,   etc.),  authors,   dates  of  publication/ 
copyright,  and  assigned,   arbitrary  classifications  such  as  poetry  or  country 
of  origin  or  editor's  rating.     In  effect,   criteria  are  associations  with  a 
category  or  value  rather  than  with  a  specific  object. 

Obviously, these approaches slide into each other, and a search usually
includes combinations of them. For example, you might want a section
identified by content, within a book with a specific identity.
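
The way these approaches combine in a single search can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the Text record, the select function and the sample IDs are all hypothetical names invented here, and none of the standards discussed in this issue defines them.

```python
from dataclasses import dataclass, field

@dataclass
class Text:
    doc_id: str                                   # identity: a unique ID
    body: str                                     # content: what it's about
    links: set = field(default_factory=set)       # associations with specific objects
    criteria: dict = field(default_factory=dict)  # values such as author or kind

def select(texts, doc_id=None, words=(), linked_to=None, **criteria):
    """Return the texts that satisfy every constraint that was supplied."""
    hits = []
    for t in texts:
        if doc_id is not None and t.doc_id != doc_id:
            continue                              # wrong identity
        if any(w.lower() not in t.body.lower() for w in words):
            continue                              # a content word is missing
        if linked_to is not None and linked_to not in t.links:
            continue                              # not associated with the object
        if any(t.criteria.get(k) != v for k, v in criteria.items()):
            continue                              # a criterion value differs
        hits.append(t)
    return hits

# Hypothetical sample data, for illustration only.
texts = [
    Text("man-1/ch13", "Unscrewing the bolt releases the panel.",
         {"bolt 520J-Z2"}, {"author": "Alice", "kind": "manual"}),
    Text("news-7", "The new bolt supplier raised prices.",
         set(), {"author": "Juan", "kind": "news"}),
]

by_content = select(texts, words=["bolt"])
combined = select(texts, words=["bolt"], linked_to="bolt 520J-Z2", kind="manual")
```

Searching by content alone returns both texts; adding an association and a criterion narrows the result to the manual section, which is the kind of sliding combination described above.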

More broadly, there are two approaches -- unstructured text, where you're
relying mostly on content, and structured text, where criteria and
associations and defined elements are key. (Note that Juan's structure may
be irrelevant or confusing or misleading to Alice; sometimes the goal of a
search may be to find what nobody knew was there. Would Sherlock Holmes rely
on information structured by Doctor Watson?) This distinction, although
fuzzy, more or less corresponds to the difference between:

•  on-line, dynamically changing information, where you usually search by
content and there's likely to be a lot of redundancy (and large volumes
of text to search): What's new in Leningrad? What are people saying
about the new version of WidgeText? Let's find some articles that
mention Graham Greene's years in Haiti.

•  CD-ROM, structured information, where you typically search by
association or criteria for something in particular, perhaps a unique,
specific answer: What happens if this bolt is unscrewed? Let's see what our
policy is on paternity leave for unmarried fathers.

However, text bases of periodicals and other random texts stored on CD-ROM
(basically, on-line services on disk) tend to have the character of the first
group. Of the three would-be standards discussed here, WAIS (for Wide-Area
Information Servers) is oriented to on-line information, while SFQL
(Structured Full-text Query Language) is oriented to structured CD-ROM
information. The third, CD-RDx (for CD Read-only Data exchange), is designed
for CD-ROMs, but is better suited to unstructured information (or less
optimized for structure) than SFQL. (Full details -- and qualifications of
these generalizations -- begin on page 6.)

Text retrieval is more than just information for researchers [...]; it also
supports tasks such as running help desks, deriving qualitative measures for
[...] press coverage, interpreting and responding to complaint letters,
assembling precedents for legal cases or other decision-making processes,
and many other [...]. Moreover, if you can specify a text object and
procedures to act on text objects, [...] automate a lot of work. [...]
publishing systems can automate previously [...] object-manipulation work
not [...] and layout, but in conditional printing, document as[...]

Client-server: The story so far...

The common notion of [...] server is a database server, which supplies data
[...] retrieved by SQL. Then you write client applications to do things to
the data specified, and store [...] along the way. [...] other manipulations
such as number [...] or polling of a physical measurement device.


Tools such as Agility's Wijit (Release 1.0, 11-90) or Sandpoint's Hoover,
for access to public data [...] things, are designed to solve the
text-retrieval (TR) interoperability problem. But they do so by building
emulators/queries for each front-end to talk to each back-end.
Agility/Dun & Bradstreet's John Landry notes the problems of continually
changing back-ends, which vendors solve by updating their front-ends
simultaneously. This creates few problems for their clients beyond updates,
but big problems for companies such as Agility or third parties using and
reselling the content. The standards discussed here would force the back-end
vendors to hide their "innovations" behind an insulating layer that could
interpret the standard protocol. (Wijit does the work at the client,
creating the appropriate messages for each service it addresses and
translating them back and forth into mail messages for the user; these TR
standards would distribute the effort between client and server.)

But SQL is a productive aberration in the world of clients and servers.
Most clients cannot talk to most servers. Instead, matched pairs communicate
using proprietary protocols, getting the benefits of distributed data and
[...] performance, and perhaps security or transaction [...], but not
heterogeneous access. SQL was an important [...] heterogeneous access:
insulation of the specifics of one [...] from the specifics of another. Yet
there are performance penalties and [...] for client and server [...] moved
around from server to server or client to client (although data does move).
Most vendors and developers actually use supersets of SQL and thus are
dependent on the features in the supersets. [...] server to be developed and
installed independently [...]

Client-server applied to text

So how does text fit into this scheme? Text-oriented systems and tools can
benefit from the same sort of architecture, and from the same benefits of
insulation through a common protocol, although the protocols themselves are
different from SQL. Indeed, most text-search programs already use a
rudimentary client-server architecture: The terminals are clients, and the
hosts are servers. Most of the intelligence resides in the hosts, and
requires a specific form of input from the clients, which are mostly dumbish
terminals that know only how to log on and validate a request's syntax.

There are other kinds of examples, of course. For example, you can
integrate a text client with a database to generate boilerplate letters. Or
you can maintain a (relational) database of text objects, and use an expert
system or a table as a client to assemble the components of a document. Saros
Mezzanine is basically a SQL Server database of DOS files, each listed as a
single record in the database, which can be found by attributes stored in
the fields of each record. (The files themselves are stored outside the
database, and incorporated only by reference.) Reach Networks uses a
database to maintain a highly structured and linked set of text files.

And then there's Lotus Notes, which uses a tightly-coupled client-server
architecture: The client knows the server data structures intimately, and
vice versa. The benefit is that you can get specific pieces of text,
arranged in specific ways such as outlines, tables, and chronological lists.
You get the benefits of distributed access within a well-defined,
homogeneous environment, but you lose the opportunity for access from
heterogeneous systems. It's the usual trade-off between functionality and
generality, as with applications written with SQL supersets. They use a
common format for specifying the data, but the applications themselves are
platform-dependent.

As noted, the goal is to have a protocol that can keep the front-ends and the
back-ends independent of each other. (We ignore the need for communications
standards to establish contact in the first place. They are important and
necessary, but not relevant to this discussion. It's assumed that you can
establish a link, and that you have the proper authority and scripts to log
on to any given service. Standards here would also be handy, but they are
another issue.)

Three  contenders 

The three significant standards efforts in this area are immature and not
widely known or effectively promoted. Each reflects the biases and needs of
its originating community. You may be able to create a standard by
committee, but you can get it adopted only through vigorous, effective
marketing by people with vested interests who make more than token efforts to
reach broad markets. Where are the 3Coms and Oracles* for these standards,
to say nothing of the IBMs and Intels? Will Slate or someone else sell
WAIS, SFQL and CD-RDx clients for PenPoint machines?

* [...] Xerox, DEC and Intel, but 3Com was [...] access [...] everyone. SQL
was [...] formed the basis of Oracle's [...]

Many proponents of each standard are barely aware of the others. In part,
this reflects the gulf between the on-line and the CD-ROM communities -- a
gulf which itself reflects the immaturity of the whole field. Basically,
the on-line people work with dynamic, continuously updated text and focus on
content search (with some exceptions in the case of legal databases), and
the CD-ROM people work with fixed, periodically updated texts with carefully
architected structures and links. Thus it's appropriate that the
content-oriented WAIS standard comes from the library/on-line community and
is based on its Z39.50 protocol for electronic card catalogues, while the
structure-oriented SFQL approach comes from the CD-ROM/hypertext world of
aircraft documentation. The third proposal, CD-RDx, also CD-ROM-oriented, is
sponsored by the intelligence community for use on CD-ROMs with many
varieties of data structures and types. (With the requisite plumbing, the
CD-ROM protocols could of course be implemented for on-line access, and vice
versa.)

Each group needs to expand outside its own community -- WAIS from the
research/Internet community to commercial on-line services, SFQL from the
aerospace industry to other commercial communities that could set industry
data standards (insurance contracts? mortgages? construction plans?), and
CD-RDx from government and a single vendor to commercial data suppliers.


The goal of all three is to allow any client to retrieve text from any
server by using a simple protocol to specify texts by content, criteria or
association, not by specific identity. The SFQL approach envisions a world of
specific domains, where everyone is talking about, say, airplane parts; data
structures and relationships are defined industrywide, but implemented
differently on each server. The WAIS approach is more general and works
across domains but without the power of SFQL; it could be used arbitrarily
for searches across a wide range of Internet servers, news services, public
or private databases, and possibly into SFQL servers with alternate
front-ends. (An SFQL server would work in front of an unstructured text
database, but it would be wasteful.) CD-RDx can handle either kind of data,
using full-text search as necessary, but is implemented for use with CD-ROMs.

Thus  these  standards  aren't  so  much  competing  as  oriented  to  different  but 
still  overlapping  tasks.     One  standard  would  be  good,  but  insufficient;  two 
or  even  three  complementary  standards  would  be  much  better.     Twenty-nine  (or 
is  it  37?)   "standards,"  the  situation  we  have  now,   is  a  waste. 

WAIS: MANY WAYS TO DO IT

WAIS is pronounced "ways" and stands for Wide-Area Information Servers. The
"Wide-Area" aspect is secondary to (or easier to achieve than) the promise
of heterogeneous access. WAIS is a project of four groups: Thinking
Machines, the instigator, as a follow-on to its work with Dow Jones that
created a text server for DowQuest (see Release 1.0, 1-88); Dow Jones News
Retrieval, a content supplier; Apple Computer, focused on the interface; and
KPMG, a highly involved user. The project leader is Brewster Kahle, a
co-founder of Thinking Machines and also a virtual employee of Apple, where
he spends a lot of time. The single greatest problem with this project as a
standards effort is that it is being developed by a tight group of dedicated
people; they tend to forget that they are trying to develop something
wonderful rather than something general. However, there are now a lot of
independent third parties using the WAIS source code to create WAIS servers
and clients at some 150 universities, and 27 WAIS databases newly available
over the Internet (too new to draw many conclusions from).

What is still missing is commercial commitments, but things look promising.
Dow Jones is evaluating the WAIS pilot; KPMG found it extremely useful but
doesn't have a wide-area network to use the service on a broad basis. Mead
Data has participated in the implementation committee and is working on a
WAIS prototype, but with no firm plans for it so far. "We need to have a
published external interface for Mead's Nexis commercial news and
information" (but not necessarily its structured Lexis legal service), says
senior architect Peter Ryall. Other on-line vendors such as Dialog and
CompuServe aren't active so far. Pandora Systems, a small consulting firm
specializing in on-line access, plans to build a GeoWorks-based WAIS
front-end, nicknamed the "cyberspace cockpit." His goal is to mimic the Apple
interface (with permission) and extend it with facilities for managing access
and filters for Internet news groups. Also, NeXT plans to incorporate WAIS as
part of a broader information strategy which will include structured searches
as well as the pure WAIS natural-language approach. NeXT is already using a
prototype to work on access to a variety of sources -- news feeds and
relational databases -- says NeXT's Adam Hertz.

The WAIS project itself is focused on providing idiot-proof [...] access to
text, while the protocol standard is intended to support a variety of query
methods, including Boolean or conceivably SFQL (below). The general part of
the system is a small, simple protocol, based on a library-community
ANSI-NISO (American National Standards Institute-National Information
Standards Organization) standard called Z39.50-1988 (also proceeding within
the International Standards Organization as DIS 10162 and DIS 10163, but
nicknamed SR-1 for Search & Retrieval).

Type 1, the only subset of Z39.50 defined so far, is Boolean retrieval
typically applied against an electronic card catalogue, not against the full
text itself. Active proponents of Z39.50, defined in 1988 but just now
coming into use, include just about the entire US research library
community -- the Library of Congress, the Online Computer Library Center (an
early user of Tandem machines), the Research Libraries Group,
Carnegie-Mellon, and the University of California.

Z39.50  gets  a  makeover 

WAIS is a superset/subset of Z39.50 (originally defined as Type 3 but now
probably going to be an extension of Type 1), with some subtle changes to
broaden its reach and eliminate some of the powerful but restrictive
features of the original. These extensions are likely to be adopted by the
NISO committee and merged back into the Z39.50 standard. Clifford Lynch of
the University of California's Division of Library Automation is a key
person in the Z39.50 effort, and is also tracking the WAIS project closely as
a leader in the NISO committee shepherding Z39.50's evolution.

Where Z39.50 was originally designed to search electronic catalogues,
returning a list of titles and document IDs so that you could then select the
ones you wanted from a list, the WAIS approach is more oriented to full-text
and even multi-media. (For multi-media, the search routines look for text
associated with the non-text items, which are retrieved separately by IDs.)
Thus Z39.50's Boolean searches of defined fields in a card catalogue (or any
other document) are still possible but are no longer an integral part of the
spec, which passes through arbitrary strings for full-text search as a least
common denominator.

Moreover, while the original Z39.50 server maintains the "state" of the
session -- i.e., it knows what documents it has listed for the user and can
then select those he picks from the list -- the WAIS spec requires the
client to maintain that list. Then the client sends back the precise IDs of
the documents he wants searched to select parts, or to retrieve in full.
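
To make the division of labor concrete, here is a minimal Python sketch of a server that keeps no session state and a client that holds the result list itself. The class and method names (StatelessServer, Client, search, retrieve, ask, pick) and the sample IDs are hypothetical and are not taken from the WAIS or Z39.50 specifications; the sketch shows only the idea, not the wire protocol.

```python
class StatelessServer:
    """Hypothetical server: answers each request without remembering sessions."""

    def __init__(self, docs):
        self.docs = docs  # maps document ID -> full text

    def search(self, words):
        # Return (ID, first line) pairs for documents containing every word.
        wanted = [w.lower() for w in words]
        return [(doc_id, text.splitlines()[0])
                for doc_id, text in self.docs.items()
                if all(w in text.lower() for w in wanted)]

    def retrieve(self, doc_id):
        # The request carries the precise ID, so no session lookup is needed.
        return self.docs[doc_id]


class Client:
    """Hypothetical client: it, not the server, remembers what was listed."""

    def __init__(self, server):
        self.server = server
        self.listing = []

    def ask(self, *words):
        self.listing = self.server.search(words)
        return [title for _, title in self.listing]

    def pick(self, position):
        doc_id, _ = self.listing[position]
        return self.server.retrieve(doc_id)


# Hypothetical sample data, for illustration only.
server = StatelessServer({
    "dj:001": "OS/2 in the press\nReviews were mixed.",
    "dj:002": "Widget sales\nSales rose.",
})
client = Client(server)
titles = client.ask("press")
text = client.pick(0)
```

Because every retrieve call names its document explicitly, the same ID could just as well be sent to a different server that happens to hold the document.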

The benefits are that a single server can handle a number of clients more
effectively, since the server handles each client transaction by
transaction, and that documents identified by unique ID in one transaction
can be used in a query to another server as well as to the original one. The
WAIS protocol also includes an optional procedure for relevance feedback,
whereby you can send a document ID and optional subsetting parameters
(paragraphs, range of bytes, etc.), which is transformed into a document by
the system as the text of a query. Exactly how the document gets from server
to server (and is paid for, if necessary) is an exercise left to the systems
implementer, but logically it is possible.
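
The relevance-feedback step can be pictured as a small function that turns a document reference into query text. The sketch below is an assumption-laden illustration: feedback_query, its byte_range and paragraphs parameters, and the fetch callable are hypothetical names, and the blank-line paragraph convention is chosen here only for simplicity.

```python
def feedback_query(fetch, doc_id, byte_range=None, paragraphs=None):
    """Turn a document reference into the text of a new query.

    fetch is any callable that returns the full text for a document ID;
    how the document actually travels between servers is left open.
    """
    text = fetch(doc_id)
    if byte_range is not None:
        # Keep only the requested range of bytes.
        start, end = byte_range
        text = text.encode("utf-8")[start:end].decode("utf-8", errors="ignore")
    if paragraphs is not None:
        # Keep only the requested paragraphs (blank-line separated).
        parts = text.split("\n\n")
        text = "\n\n".join(parts[i] for i in paragraphs if i < len(parts))
    return text


# Hypothetical sample data, for illustration only.
store = {"nyt:42": "Bolt failures rise.\n\nInspectors blame torque.\n\nVendors disagree."}
whole = feedback_query(store.__getitem__, "nyt:42")
second_paragraph = feedback_query(store.__getitem__, "nyt:42", paragraphs=[1])
first_bytes = feedback_query(store.__getitem__, "nyt:42", byte_range=(0, 4))
```

The returned string would then be sent as an ordinary free-text query, to the original server or to another one.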


Sending a Message

The protocol transmits text strings to search for and specifies where. It
can also handle instructions for which fields to search or Boolean
constraints or relationships among words -- how close together they must be,
ands and ors and nots, as well as criteria such as date of publication,
author, publisher, type of publication, headlines or abstract, or within the
full text. It supports Boolean constraints and criteria explicitly but
optionally; it could also support almost any other format, including, in
extremis, a phrase that said in effect, "now speaking SQL:" which would alert
an SQL server at the other end to turn on its SQL parser. Other systems
would simply interpret the words in the SQL query as words, and do their
best to find relevant texts according to their own methods. In fact, you
could even use WAIS for actions, such as ordering reprints, although not
formal transactions (at least as far as WAIS is concerned).

The WAIS protocol allows any client and any server to communicate without
crashing. Thus, in a natural-language query, there could be a lot of
extraneous stuff: "I'm wondering how come OS/2 seems to get such a rotten
deal in the press." Or, "I'd like to know about poems about Alice Haynes by
Juan Tigar." On the other hand, a structured query could use defined fields
("author," or "to" and "from") unintelligible to the server that receives
them. In practice, you're unlikely to query a news database by "addressee,"
as you might a mail server, but if you did, the news database would simply
ignore the "to" field.
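
A best-effort server of this kind might behave roughly as in the following Python sketch. The query layout (a "text" string plus a "fields" dictionary), the KNOWN_FIELDS set and the answer function are hypothetical conveniences for illustration and are not the WAIS message format.

```python
KNOWN_FIELDS = {"author", "headline"}  # fields this hypothetical news server indexes

def answer(query, articles):
    """Best-effort match: honor known fields, ignore the rest, search free text."""
    fields = query.get("fields", {})
    usable = {k: v for k, v in fields.items() if k in KNOWN_FIELDS}
    ignored = sorted(set(fields) - KNOWN_FIELDS)
    words = query.get("text", "").lower().split()
    hits = []
    for art in articles:
        if any(art.get(k) != v for k, v in usable.items()):
            continue                     # a field the server understands did not match
        body = art["body"].lower()
        if words and not any(w in body for w in words):
            continue                     # none of the free-text words appear
        hits.append(art["headline"])
    return hits, ignored


# Hypothetical sample data, for illustration only.
articles = [
    {"headline": "OS/2 gets a rotten deal", "author": "Juan Tigar",
     "body": "The press has been hard on OS/2."},
    {"headline": "Poems for Alice", "author": "Juan Tigar",
     "body": "A new collection of poems."},
]
hits, ignored = answer(
    {"text": "press", "fields": {"to": "Alice Haynes", "author": "Juan Tigar"}},
    articles,
)
```

The "to" field is reported as ignored rather than rejected, so the exchange completes without an error even though the server never understood part of the query.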

The protocol itself carries no high-level notions of relevance, concepts,
categories or structure; the interpretation happens on either side (just as
with SQL there are complex data structures on one side and complex
application and display logic on the other). This, of course, is where WAIS
is likely to meet its strongest objections -- from people who say, "Well, my
front-end can do a lot more. Why should I dumb it down for this system?" In
fact, WAIS can pass through intelligent, structured queries as well. Not even
stop words are removed, so that you can have two interdependent systems
communicating with each other unknown to the WAIS protocol. Matched clients
and servers work better in concert, of course, but all can work together to
some extent. The goal is for all these approaches to compete on a playing
field leveled by WAIS.


How does WAIS compare with Xanadu, the information server designed by Ted
Nelson and now owned by Autodesk? (See Release 1.0, 7-89.) To the naked ear,
they sound alike. But they aren't. Xanadu is a server; it maintains close
control over the content, and is a way of publishing and assembling info and
managing it at a more granular, ID-oriented level. With Xanadu, you specify
or follow links to get the precise, unique thing. WAIS is a way of finding
and distributing information [...] published in a variety of formats. With
WAIS, you describe, and get a number of possibilities. Of course, you could
have a Xanadu-specific WAIS front-end to Xanadu, but if you addressed Xanadu
with the WAIS default natural-language query you would lose Xanadu's full
power.


The server responds

The server makes its best effort to answer the user's query and sends back a
list of texts, identified fully according to the WAIS syntax, with an ID, a
title, score, type and date. (The ID includes the originating source, the
copyright owner and a unique ID, as well as the server supplying the
document [...] given it by that server.) The user [...] the [...] list to
receive the full content (or a specified subset) of the documents listed, or
he can refine or modify the query (with relevance feedback or other
constraints).

The documents are listed by title (either a specified title or the first
line of text by default), in order of their scores. The scores measure
relevance, according to algorithms that may vary from server to server. On
a Boolean server, that might simply be the number of times a word
appears in a document, or the number of times it appears divided by the
number of words in the document, or it might be a 1 for "present"; on a
Thinking Machines server, it might be a complex, proprietary ranking that
involves weights, co-occurrences of words, etc. (see Release 1.0, 1-88 and
3-90). The type defines the document's format -- TEXT, PICT, TIFF, etc. -- an
extensible list that could include spreadsheet files or voice annotations.
WAIS has already extended Z39 to handle multimedia by handling larger files,
parts of files, and "understanding" the vagaries of graphics and potentially
sound or video formats. Obviously, the client needs the appropriate
facilities to represent the objects retrieved to the user, but the protocol
itself can handle anything digital.
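The three simple scoring rules mentioned for a Boolean server can be written out in a few lines. This is a minimal sketch under stated assumptions: the function names and the tokenizing rule are invented for illustration, and a real server is free to score however it likes.

```python
import re
from typing import List


def words(text: str) -> List[str]:
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def count_score(term: str, text: str) -> int:
    """Number of times the term appears in the document."""
    return words(text).count(term.lower())


def frequency_score(term: str, text: str) -> float:
    """Occurrences divided by the number of words in the document."""
    tokens = words(text)
    return tokens.count(term.lower()) / len(tokens) if tokens else 0.0


def presence_score(term: str, text: str) -> int:
    """1 if the term is present, otherwise 0."""
    return 1 if term.lower() in words(text) else 0
```

Because each server chooses its own rule, scores from different servers are not directly comparable; a client can only rely on the ordering within one reply.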

Another defined type is WSRC (for Wais SouRCe), which includes IDs for
documents located elsewhere and instructions for connecting to the other
server(s) where they are located -- i.e., a sort of incorporation by reference.
That means one server can act as an index/pointers for others -- or a yellow
pages, if you will. WAIS also offers a standard way to describe servers.
In terms of its contents, a server can describe itself in answer to a WAIS
full-text query, but other information is useful too. For example, what
protocols do you support? What networks are you on? Who owns you? Where
are your documents from and how frequently are they updated? And of course,
what are the charges? The description of servers is one good place to
include pricing information, although some documents may be priced
individually. (You might even be able to run a remote interface to American
Information Exchange, Release 1.0, [?]-90.)
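The "yellow pages" idea, one server answering queries with descriptions of other servers, can be sketched as follows. Everything here is an assumption made for illustration: the `SourceRecord` fields follow the questions listed in the text, not the real WSRC layout, and the word-overlap ranking merely stands in for whatever a real directory server would do.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceRecord:
    # All field names are illustrative assumptions, not the WSRC format itself.
    name: str
    address: str           # how to connect to the server
    description: str       # free text describing the server's contents
    update_frequency: str
    charges: str


def find_sources(directory: List[SourceRecord], query: str) -> List[SourceRecord]:
    """Rank servers by how many query words occur in their descriptions."""
    terms = set(query.lower().split())
    scored = []
    for src in directory:
        overlap = len(terms & set(src.description.lower().split()))
        if overlap:
            scored.append((overlap, src))
    # Sort on the overlap count only; ties keep their directory order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [src for _, src in scored]
```

The client would present the returned records to the user, who then picks which servers to search.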

How does the refinement of the query relate to the first version? In a
Boolean system, it could be the addition of "and not Paris." In a more
sophisticated one, "before 1985," referring either to dates within the text
(although the system might also pick up "Section 1203" or "1625 feet") or
the date of publication of the text to be retrieved. In another system, it
might be "more articles like the third one you selected, but nothing like
the first on the list" (which concerns a different Alice Haynes). In this
case, the second query consists of all the words in the selected document [...]


Behind  the 


[...] news groups, mail ar[...]
[...]nding things -- from
Connection Machine's brute-force string searches to full-text indices to


[Figure: Prototype design for information "notebook." This screen depicts a
notebook in which a user can skim, search, organize and annotate
information. The call-outs from the figure follow.]

Annotation is supported through this palette of tools. The user is given
access to (from top to bottom) "Posted" notes that can hold text data, a
special type of Posted that can store audio annotations and a number of
colored highlighters.

FIND

The "Find" button and "next" and "previous" arrows allow the user to look
for data based on a number of characteristics. The user can search for
particular text strings; in addition, the user can select to search for
earlier or later instances of particular highlight colors, Posted notes or
audio annotations.

This central portion contains the "content" of the notebook -- i.e., the
actual data that was retrieved by the user.

This "bird's eye view" of the notebook allows the user to see a visual map
of items in the vicinity of the current location. The large arrow marks the
current location; the sizes of annotations are exaggerated. The user can
quickly see that two images are immediately "above" the current location, a
highlighted passage is located farther "above" and a "Posted" note is
located "below." This view can also be used as a navigational device -- by
clicking on the desired location, the notebook content jumps to that
location.

A hierarchical outline allows the user, in this case, to view the contents
in chronological order. The user can expand the outline (e.g., "open" a year
into its months) or use it as a navigational device to jump to a particular
section of the notebook. The user can also change the notebook's
organization by selecting a new attribute from the "Organize by" menu at the
top of the column.
lists of articles and abstracts to a bulletin board of text items identified
by keywords and classified into categories or news groups automatically or
by a sysop, or selected as "editor's choices" by someone you revere. You
could also have employee handbooks, automated help systems, on-line
documentation, library catalogues, a database of patents with numbers and
keywords and drawings, and so forth. The classification scheme could be
anything from an alphabetical list of words (a plain index) to a hierarchy
such as Verity's Topic, tailored for a certain subject, to a chronological
file of mail messages to a highly structured text database such as Lotus
Notes.

The  WAIS  project 

The WAIS project comprises a number of separate interoperating
installations, including a loaner Connection Machine at the KPMG New Jersey
headquarters office that has now been returned to Thinking Machines. KPMG,
the primary nontechnical user, experienced all the benefits other accounting
firms have experienced with Notes and the Reach network (see Release 1.0,
2-91): better and more up-to-date information, better sharing of client
contacts and corporate knowledge... overall a sort of automation and
broadening of the old-boy network.

The user interface, "Rosebud," was developed by Apple's Advanced Technology
Group, based on its earlier work on the interface on the Dow Jones DowQuest
system. It allows users to type in natural language queries and to mark up
the replies as yes, no, maybe, and select parts that are of particular
interest. Those texts then constitute the basis of the second query (as
supported by the protocol). Rosebud also includes some added features, as
shown on the previous page. (This is from a paper Apple presented this week
at the SIGCHI human interface meeting in New Orleans.) Another idea
described is a "newspaper" which consists of a laid-out set of responses to a
set of queries that are run daily: Thus each day you could get, for
example, software news in the upper right-hand corner; John Sculley's daily
activities in a box at the lower left; lacrosse on the left; and any mention
of your own name featured in boldface type on top in the center.
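The relevance-feedback step described above, where marked-up replies become the basis of the second query, can be sketched in a few lines. This is a rough illustration, not Rosebud's or any server's actual method: the function names are invented, and treating "no" texts as negative weights is an assumption of the sketch.

```python
import re
from collections import Counter
from typing import Iterable, List


def tokens(text: str) -> List[str]:
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def feedback_query(liked: Iterable[str], disliked: Iterable[str] = ()) -> Counter:
    """Build a weighted second query from marked-up replies.

    Words from texts marked "yes" count positively; words from texts
    marked "no" count negatively (an assumption made for this sketch).
    """
    weights: Counter = Counter()
    for text in liked:
        weights.update(tokens(text))
    for text in disliked:
        weights.subtract(tokens(text))
    return weights


def score(query: Counter, document: str) -> int:
    """Sum the query weights of every word occurrence in the document."""
    return sum(query[w] for w in tokens(document))
```

A daily "newspaper" of the kind described would simply rerun a stored set of such queries and lay out the top-scoring replies.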

The back-ends are Connection Machines, which perform high-speed parallel
string searches and matching algorithms to retrieve the texts most relevant
to each query. Other WAIS servers, such as those at universities, mostly
use serial-search text engines and indexes. The WAIS server software will
also shortly be installed on existing Connection Machines at Xerox PARC, at
a shared site at Baylor and Rice Universities, and some other places. You
can buy your own starter set for about $150,000, software included.


Like other client-server architectures, WAIS offers economies of scale. If
you're doing something very smart, you can apply it on the server side,
where anyone can use it through WAIS, rather than on the client side (where
only a subset of customers will buy it). This assumes, of course,
reasonable adoption of WAIS. The client-server separation allows the maximum
intelligence in the model applied to the texts, and maximum access even from
clients who don't know that model. Likewise, in general, it's best for the
protocol to pass on the query in its full richness, rather than trying to
interpret it. Clever clients can apply their cleverness across a
multiplicity of servers.
The user interface helps in making the system intelligible to the user
(rather than the user intelligible to the system, which is the server's
job). On the server there's complex text, and possibly text-searching and
categorization capabilities. On the client side, there's a complex human
reader/editor/writer. But communication between the two sides is sparse.
Thus the protocol provides the generality, and the systems on either
side provide the richness and power.

Appendix:     Still  on  the  agenda 

Issues of security and the like are up to each server/service. So are
payments. Specifying costs is not yet part of the protocol, although this
information can ride along through it. There are a number of possible
pricing algorithms: by time and time of day or week, by length or identity of
items found or delivered, with charges potentially varying from document to
document as well as server to server. Although many of the people
spearheading this effort are of the free-information camp, it is vital for the
spec to be broadened to include a way to specify charges. (They know this;
they just forget it when they get excited.)

Pricing information would make the protocol useful not just to libraries
(which also need to cover their costs, rather than restrict access to other
member libraries) but also to more commercial services such as those of Dow
Jones, Reuters, Mead Data and hundreds of potential information suppliers
who will be drawn into the broader market WAIS could foster. Rather than be
a subscriber to a specific service, with an account name and a specific
piece of front-end software acquired along with the subscription, one could
be anyone with a valid credit card number -- and some positive
identification, of course. The adoption of the WAIS standard, in fact, could
be an important factor in the blossoming of the Electronic Frontier, with
information traded freely (but not for free) among a wide community.

Free services can also be part of the same network. Indeed, we believe a
properly competitive market will include both free and fee services. One
early service, of course, will be a server of servers (Thinking Machines
already offers one), an information service listing where you might want to
search for certain kinds of information. Instead of texts, it will respond
to queries with the names of likely servers for the information desired, in
a format that the front-end can present to the user to select from for the
search. (Pricing information will be included.) A smarter server, with
pointers to the best articles on a particular topic -- basically, a
selection editor as opposed to a copy editor -- could charge for its services.
(See Release 1.0, 7-89, on hypertext publishing.)

There are also physical connection issues to resolve. Those can be handled
by the client, which either will have the numbers of the servers desired, or
know how to reach them over some internal or external mail network. Remember
that WAIS is a spec; the implementations will vary [...].
It simply makes it possible for systems to interoperate [...]; the
underpinnings have to be there. (Most of these issues also apply if the other
two standards are used to communicate with on-line services.)

The  sequel 

The consortium -- or rather, the [...] behind WAIS -- hasn't
yet begun any formal effort to promote it. (Consider our coverage one of
the first such moves.) Accordingly, there's no groundswell of support yet.
A few vendors are aware of the project, but most aren't au courant. Many
consider it a proprietary effort on the part of Thinking Machines and Dow
Jones. "They love the natural-language, relevance-feedback approach, of
course," said one person we talked to, "because it takes a lot of machine
power and Thinking Machines can do it better than anyone else." Although
the protocol allows for intelligent searches, the hearts of this group are
definitely with the naive user.

But  all  a  standard  needs  is  a  broad  front,  not  necessarily  a  consistent, 
united  one.     While  the  other  two  standards  efforts  described  below  are  also 
significant,   the  role  of  WAIS  as  a  means  to  communicate  in  almost  real-time 
among  people,  rather  than  access  to  prepared,  edited,   structured  data 
sources,  makes  it  of  more  social,  political  importance  than  the  other  two. 


SFQL:  WHEN  STRUCTURE  COUNTS 

The chief advantage of WAIS is its breadth and adaptability. It is also
neutral; you can pass intelligent messages across it, but it's unaware of
them. A different approach is that of SFQL, which allows for independent
clients and servers, by allowing them to communicate formally about the
structure as well as the content of the data. (Or they may share a common,
standard data schema specified by an outside authority, such as a trade
group or anyone who controls both clients and servers.)

SFQL is the product of a group of airline and aerospace companies and their
vendors. It was driven by their need to publish, maintain and retrieve
documentation for aircraft, which have components (most notably airframes and
engines) from a variety of suppliers. One early effort was a customer's:
British Airways, KnowledgeSet, Maxwell Data and Boeing got together to put
documentation for BA's Boeing 757 aircraft onto CD-ROM in 1987. (However,
that system is closed; i.e., you can't use its software to retrieve any
other vendor's documentation for any other Boeing aircraft -- or any other
aircraft owned by BA.)

The BA project was one of the first; now this problem has become
increasingly apparent. It's aggravated because engines and airframes come from
different vendors, and some airlines contract maintenance out to other
airlines. Typically, you need a separate system for each supplier, since each
supplier builds its own CD-ROM documentation system in conjunction with one
of several CD-ROM preparation houses. Moreover, BA has no wish to fund
another such project; presumably, it would like its suppliers to provide
documentation on CD-ROM in a format that could be read by front-ends from a
variety of competing front-end system providers.

At the instigation of the Air Transport Association and the Aerospace
Industries Association, a committee of customers and vendors for both
equipment and software documentation systems got together to come up with a
standard for interoperability -- and two separate, interoperable
implementations. The group includes software vendors Context Corporation,
EDS, Fulcrum, IBM, KnowledgeSet, Maxwell Data Management and TMS; ATA members
American Airlines and British Airways; and AIA members Aerospatiale, Boeing,
Douglas and GE.
What is SFQL?

SFQL stands for Structured Full-text Query Language, based on a subset of
SQL (Structured Query Language). It leaves out relational database
functions such as dynamic updates, joins, transaction management, dynamic view
definitions and subqueries which don't (for now) seem relevant or
cost-effective with text databases. The premise -- and power -- of SFQL is that
the text being searched does have some structure, including such things as a
title, an author, an abstract, headings and subheadings (which can be called
out to produce a table of contents). There may also be cross-references
between items, a topic index, versions and updates.

Full-text search is probably both too broad and too vague to handle these
kinds of queries. Full-text search with relevance is quantitative, whereas
with SFQL you can get precisely the right references -- rather than enough
information to satisfy curiosity or a query. Compare the concrete
relationship of a bolt to the fan it attaches to an engine, and the vaguer,
discreet connection between Juan and Alice (they co-occur a lot, but their
exact relationship is unknown -- and keeps changing). Moreover, SFQL can
build (project, in relational terms) new text structures: You may want
different subsets depending on whether your plane has two galleys or extra
first-class seats.

Thus, SFQL implicitly turns the text into sets of tables, where each item is
a record with a multiplicity of fields of arbitrary length (below). Just as
you can create a hierarchy from tables showing which items fall under which
other items, so can you create a text database with cross-references,
components and so forth. Then you can use a superset of SQL -- with the
important concepts of "CONTAINS a string," [...]
document, and proximity of one term to another added -- to search it.
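To make the "text as tables" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not SFQL: the table layout, the sample rows and the `contains` helper are assumptions for illustration, and SQLite's `LIKE` operator merely stands in for the CONTAINS concept described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each item of text is a record; "parent" records which item it falls under.
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, parent INTEGER, "
    "title TEXT, author TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?, ?, ?)",
    [
        (1, None, "Engine manual", "Maintenance group", ""),
        (2, 1, "Fan assembly", "Maintenance group", "Torque each fan bolt to spec."),
        (3, 1, "Fuel system", "Maintenance group", "Inspect the fuel pump seals."),
    ],
)


def contains(field: str, needle: str):
    """Items whose given field contains the string (LIKE stands in for CONTAINS)."""
    if field not in {"title", "author", "body"}:
        raise ValueError("unknown field")
    sql = f"SELECT id, title FROM item WHERE {field} LIKE ? ORDER BY id"
    return conn.execute(sql, (f"%{needle}%",)).fetchall()
```

Restricting the search to one named field is what distinguishes this style of query from a plain full-text search over the whole document.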

The initial version of SFQL dealt with the text as a simple concatenation of
variable-length fields in a lengthy record; it supported [...]
criteria and full-text search within any or all fields ("contains..."). The
newer version, SFQL2, now in final revision, can handle the more subtle (and
appropriate to structured documents) notions of hierarchies and components
[...], though the schema is still maintained as tables, not
as a logical hierarchy. That is, a paragraph is also part of a chapter, any
text can contain a variety of separately specified fields such as part names
or diagrams, cross-references can be maintained, and [...]
headings can also be viewed as a table of contents. It all has to do with
the ability of SQL (inherited by SFQL) to create views, so that the same
item of text can be seen as itself, as part of a chapter, or as a collection
of subsections. Headings can be collected into a view as a table of
contents, and cross-references can be maintained as fields in yet other tables.
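The role of views can be illustrated with ordinary SQL. The sketch below again uses SQLite through Python's `sqlite3` module rather than SFQL2 itself, and the `section` table, its rows and the view names are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE section (
        id INTEGER PRIMARY KEY,
        chapter INTEGER,      -- id of the enclosing chapter, NULL for a chapter
        heading TEXT,
        body TEXT
    );
    INSERT INTO section VALUES (1, NULL, 'Engines', '');
    INSERT INTO section VALUES (2, 1, 'Fan', 'See also section 3.');
    INSERT INTO section VALUES (3, 1, 'Compressor', 'Blade inspection steps.');

    -- The same rows, seen only as headings: a table of contents.
    CREATE VIEW toc AS SELECT id, chapter, heading FROM section;

    -- The same rows, seen as the parts that make up chapter 1.
    CREATE VIEW chapter_one AS
        SELECT heading, body FROM section WHERE chapter = 1;
    """
)

toc = conn.execute("SELECT heading FROM toc ORDER BY id").fetchall()
parts = conn.execute("SELECT heading FROM chapter_one ORDER BY heading").fetchall()
```

Nothing is copied: both views are just different ways of looking at the one stored table, which is the point the text makes about seeing an item as itself, as part of a chapter, or as an entry in a table of contents.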

Vendors  two 

The original SFQL concept and spec were developed at GE's Corporate Research
and Development Center by Neil Shapiro, now an independent consultant with
his own firm, Scilab. Further work on it and SFQL2 was continued by Shapiro
and Fulcrum of Ottawa and KnowledgeSet of Mountain View, CA. Fulcrum is
uniquely suited to this task, since it's a long-time believer in
client-server technology (its first full client-server toolset came out late
last year after four years in development). The company isn't well known
outside the text-retrieval world because most of its [...]
such as Siemens Nixdorf, HP, Data General, Sun, ICL and NCR. Thus it has
an API of almost 200 commands, a strong sense of openness, [...]
to build a server to implement the evolving specs of SFQL. Fulcrum gets
about half its revenues from disk-oriented retrieval systems and half its
revenues from CD-ROM software; rather than consulting, it sells licenses to
its engine to publishers or data-preparation houses. Fulcrum, with revenues
of about $5 million last year, is owned by Datamat, a systems house (and
Fulcrum client) based in Rome.

KnowledgeSet brought to the party its intensive experience with British
Airways and Boeing, along with KRS, an engine and flexible toolset for text
preparation, and a complete user interface. (Fulcrum usually leaves the
interface to its resellers, who integrate it with their own offerings.)
KnowledgeSet is CD-ROM- and consulting-oriented; it specializes in building
[...]-management systems to order. Somewhat smaller than Fulcrum, it is a
subsidiary of Banta Corp. (which has revenues of $660 million).

KnowledgeSet sees SFQL as a way into the aerospace market, but [...]
without paid development [...]
of income. For Fulcrum, SFQL -- and openness in general, since it
sells a naked engine -- is more of a religion. The company [...]
SFQL in a forthcoming release of its software.

The two implementation teams, working separately, were Aerospatiale, using
an engine from Fulcrum, and GE, using the KRS engine from KnowledgeSet. Each
group developed both an information server with aircraft documentation and a
separate Windows-based front-end. In fact, GE built two front-ends -- an
interactive SFQL front-end where you would actually build a query in the
SFQL syntax, and a forms-based front-end that dynamically loaded field names
supplied at runtime by the server. Aerospatiale had a forms interface with
field names based on the ATA 100 standard for documentation; it was easier
to use but less flexible.
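The forms-based approach, where the front-end learns its field names from the server at runtime, can be sketched with SQLite standing in for the back-end. This is an illustration under stated assumptions, not GE's implementation: the helper names are invented, `PRAGMA table_info` is SQLite's own way of listing columns, and the table name is assumed to come from trusted code rather than from the user.

```python
import sqlite3
from typing import Dict, List, Tuple


def field_names(conn: sqlite3.Connection, table: str) -> List[str]:
    """Ask the back-end which fields a table offers (column 1 is the name)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [row[1] for row in rows]


def build_form(fields: List[str]) -> Dict[str, str]:
    """An empty form: one blank entry per field supplied by the server."""
    return {name: "" for name in fields}


def form_to_query(table: str, form: Dict[str, str]) -> Tuple[str, List[str]]:
    """Turn the filled-in entries into a parameterized query."""
    filled = [(k, v) for k, v in form.items() if v]
    where = " AND ".join(f"{k} LIKE ?" for k, _ in filled) or "1"
    return f"SELECT * FROM {table} WHERE {where}", [f"%{v}%" for _, v in filled]
```

A front-end built this way needs no prior knowledge of the schema, which is the flexibility the text attributes to the dynamically loaded form; a fixed form tied to one standard schema trades that away for ease of use.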


Ready, set, switch!


The great moment came last year at the February AIA/ATA meeting in
Washington. Each team demonstrated its system. Then they switched disks, which
contained both data and each team's server software (which also ran under
DOS/Windows). They both still worked.


SGML, DTDs, schemas and OODBs

SGML, or Standard Generalized Markup Language, is often described as an
SQL for text. In fact, it's more like an SQL syntax and language
generator; markup is only one example of the possibilities. That is,
SGML is a small, extensible language that allows builder-users to
build Document Type Definitions that describe the various allowable
components of a specified document. The components within a document
are "tagged," or identified as various elements in the DTD, so that
they can later be manipulated by an application (for layout or
display, for example) or by a database engine (for selective publishing
or retrieval, for example).


Overall, a DTD is a framework for a document: There are DTDs for
books, for documentation manuals, for government RFPs -- hence the
government's CALS (Computer-aided Acquisition and Logistic Support)
Initiative, for catalogues, and for a variety of other documents. The
definitions can be strict or loose -- four sections with three subsections
each, or a preface and several chapters followed by an index. There can
also be content-specific "tags" such as IDs for drug names or
part names in documentation, or formats for identifying legal
questions vs. answers. Figures can be identified and linked [...]
markers, and so forth.


The specific framework for defining and relating these [...] a DTD. Or they
can [...] so that a [...]. Essentially, SGML is a tool for creating markup
languages, or DTDs. Beyond that, you can build a [...]
such as the ATA 100 spec, using the elements of a DTD.

You could also store documents in an object-oriented database, which
would maintain the [...] it implicitly as sets of tables recording the
structure. (In addition, an OODB manages the binding of methods to objects
and other niceties.)


-  tight  or  loose 


The ATA Spec 100 [...] for aircraft documentation implemented in SFQL [...]
used more broadly. Just as an SQL database has a catalogue (which is a
database about the database it manages), so does SFQL use a [...] about the
texts it manages. This schema can be a standard, as in AIA/ATA -- or it can
be built on a single server and passed to any front-end, thus providing
enough [...] front-end to communicate intelligently with that SFQL back-end
[...]. Having a standard schema gives you the [...] front-ends that make
access easy for end-users (as Aerospatiale [...]); the ability to define one
dynamically [...] overall, and means that SFQL ultimately can address a
large [...] of information models and domains.

This text metadatabase is close to, or is a possible kind of, Document Type
Definition, or DTD. DTDs are well [...] in the [...] world. See box. The
SFQL server converts the SGML document spec [...] traditionally had to be
[...] system to understand a [...] rather than as an [...].

[...]zation and the like, there will still be efficient implementations of
[...] providers, both for general performance and for efficient p[...]
the data structures defined by specific DTDs.


CD-RDx: FROM THE ULTIMATE SPECIALISTS IN INFORMATION...

One of the biggest contributors to the development of text technology in the
US has been the Central Intelligence Agency. It provided the initial funding
for Xerox's hypertext [...] and is also a key customer for
Verity's [...]. [...] Committee of the Intelligence [...]
coordinating body for the [...] offers us CD-RDx.

CD-RDx is a spec designed at the request of the [...] by Helgerson
Associates, a CD-ROM [...]. An early
implementation was fielded in the summer of [...] on a disk of export-import
[...] for the Commerce Department, and Helgerson is currently working
[...]

[...] intelligence community, DOC and
the CD-RDx advisory panel are working on a spec with Helgerson Associates;
Helgerson is working not just on an implementation, but on a number of
versions of the server software to work on a variety of hardware platforms.

The resulting software is government-domain: That is, the implementations
as well as the spec can be freely copied throughout the government and by
its direct contractors. The hope, of course, is that the spec will also
make its way out into the commercial world: Any vendor can use it, and sell
its own toolkits and implementations of it (although Helgerson will have
some advantages by virtue of being first). Since a lot of data is used by
both government and commercial firms, this makes sense.

The CD-RDx vision of interoperability is broader than those of WAIS and
SFQL: the issue here is not just client-to-server interoperability, but also
server-environment-to-indexed-data. That is, the goal is to build a range
of compatible CD-RDx server engines so a variety of operating environments
can all use the same sets of indexed data. In other words, an indexed data
disk should be platform-independent. You can take a single disk and run it
on a variety of hardware systems; the server software engine appropriate to
the local operating environment will automatically load itself.

This is especially important for government agencies, which want to pass
around indexed data from server to server among different agencies -- rather
than commercial customers, who generally only want the same client to work
with multiple servers, or on-line vendors, who want the same server to work
for multiple customers. (On the other hand, CD-RDx vendors will find
themselves able to address more platforms and thus more customers more
easily.)

Basically, CD-RDx is a set of APIs that can front-end almost any CD-ROM
indexing scheme. It hides the specifics of an indexing system, but not the
logical organization of the data or the fields and categories into which
it's classified. Its APIs are akin to (but of course incompatible with)
those of Fulcrum or a number of other vendors' -- commands to define and
manage a variety of indexing schemes, download word lists, specify query
terms and parameters, and so forth. Thus you can build a user interface
that a user can use to query the server to see the kinds of data and search
techniques he can use... and then he can use them.
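The pattern described above, a front-end first asking an engine what it can do and then using only those techniques, can be sketched with a toy index. None of this is the CD-RDx API: the class, its method names and the two search techniques are hypothetical stand-ins chosen to show the shape of such an interface.

```python
from typing import Dict, List, Set


class InvertedIndex:
    """A toy word -> document-id index standing in for one indexing scheme."""

    def __init__(self, docs: Dict[int, str]) -> None:
        self._postings: Dict[str, Set[int]] = {}
        for doc_id, text in docs.items():
            for word in text.lower().split():
                self._postings.setdefault(word, set()).add(doc_id)

    def capabilities(self) -> List[str]:
        """Search techniques this engine lets a front-end use."""
        return ["any_word", "all_words"]

    def word_list(self) -> List[str]:
        """The downloadable list of indexed words."""
        return sorted(self._postings)

    def search(self, technique: str, terms: List[str]) -> List[int]:
        """Run a query using one of the advertised techniques."""
        if technique not in self.capabilities():
            raise ValueError(f"unsupported technique: {technique}")
        sets = [self._postings.get(t.lower(), set()) for t in terms]
        if not sets:
            return []
        combine = set.union if technique == "any_word" else set.intersection
        return sorted(combine(*sets))
```

A front-end written against `capabilities`, `word_list` and `search` would not need to know how the postings are stored, which mirrors the text's point that the indexing specifics are hidden while the logical organization stays visible.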

Whereas SFQL implicitly supports a rich data schema (with all the overhead
implied), CD-RDx is a little more pragmatic, and basically lets you talk
directly to whatever indexing schemes and field structures happen to be
around, without necessarily trying to integrate them into a single model.
Matthew Goldwonn of TerraLogics, a vendor of data preparation software with
an orientation to maps, believes CD-RDx is more open to supporting maps and
other data-rich structures than SFQL, which he considers too tied to the
airline industry. In this aspect, CD-RDx has some of the flexible flavor of
WAIS, but it also has more explicit support in the spec to address the
specifics of any indexing scheme -- inverted text, table of contents, word
and phrase lists, etc. That is, it is generally for building front-end
applications to specific, structured data sets, rather than passing through

So, how real is all this? We think it could be quite important if the right
people get involved -- that is, commercial people with a vested interest in
seeing it succeed, as well as the beneficiaries -- authors who will get
wider, quicker distribution of their works, readers who will get broader but
more precise access to the information they seek, and the world at large
because information will flow around with a little less friction. WAIS
itself is simply a platform on which enterprising people will construct
elaborate schemes for filtering, describing, pricing and distributing
information. Profit, authors' pride and intellectual curiosity will provide the
motivating forces, while WAIS is the machinery that will enable those forces
to be harnessed.

We expect to see WAIS adopted from the library community out, with support
from information providers pulled by users. WAIS will also benefit from the
increasingly organized, broad community of information service users. As
they get networked, they get more vocal, more organized, and better
coordinated in making their voices heard. The electronic frontier is now being
settled by people who have money and vested interests and the commercial
force to make their voices heard.

On the other hand, in addition to the WAIS laissez-faire attitude, the world
also needs standards for precise manipulation of structured information
(which could in fact be transmitted via the WAIS protocol). Here, SFQL and
CD-RDx are directly competitive. We expect SFQL to move from the aerospace
community to other such industry groups, pulled mostly by intra-industry
trade groups, with a push from software vendors such as Fulcrum. CD-RDx
doesn't seem to have much momentum outside the government as yet, but those
various government users may be able to get some commercial users and
vendors excited.

Vendors tend to resist standards -- especially the leading vendors, who have
commercial advantages and expect the world to adapt to them. Microsoft, for
example, makes an analogy to SQL and likens its own CD-ROM standards to
dBASE; it sees no need yet for a broader client-server standard such as SQL.
Eventually, says Microsoft's Rob Glaser, SFQL will probably "work" for
Microsoft, but right now he sees no need for it. This is an interesting
analogy, given the recent impact of SQL on dBASE -- and questions about how
history might have been different had Ashton-Tate been more open with dBASE
(the Microsoft posture) or more open to SQL. The real question is: Will
the standard of the future be Microsoft's, or will it be SFQL or CD-RDx or
something else?

Overall, more and more users are beginning to use several information
services and CD-ROMs and want a common interface. Rather than create a
regulated industry (as with telephones) where you have one interface because you
have one provider, we have the opportunity to create an industry of vigorous
competitors operating with just one or two standard interfaces because
that's what customers are asking for.

RELEASE 0.5 -- INSTANT UPDATE: THE USER KNOWS BEST

One of the advantages of the WAIS protocol discussed earlier in this issue
is that it doesn't interfere with a user's best efforts to get what he
wants. Although there's a lot of power in automation and groupware tools,
people trying to work together frequently need facilitation rather than a
fancy feature set. Working together should be made simpler, not "enhanced."
Specifically, software shouldn't try to be any smarter than it can be. An
excellent example of this principle is ON Technology's Instant Update.

Instant Update doesn't do much. It just lets people share virtual paper,
update it, and pass it around. It flags conflicts but doesn't resolve them:
The last one to update a paragraph (the basic unit within an Instant Update
document) wins. It's not a fancy tool to edit shared documents, nor a system
to monitor people's movements, tell them what to do or manage conflicts.
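
The "flag it, but let the last writer win" rule is simple enough to sketch in
a few lines of Python. This is an illustration of the policy only, not ON
Technology's implementation: Paragraph, SharedDocument, the version counter
and the conflicts list are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Paragraph:
    """The basic unit of a shared document."""
    text: str
    version: int = 0          # bumped on every update
    last_author: str = ""


@dataclass
class SharedDocument:
    paragraphs: list = field(default_factory=list)
    conflicts: list = field(default_factory=list)   # flagged, never resolved

    def update(self, index, new_text, author, based_on_version):
        """Apply an update; the most recent writer always wins."""
        para = self.paragraphs[index]
        if based_on_version != para.version:
            # Someone else changed this paragraph after the author last saw
            # it. Record the clash for the people involved to sort out.
            self.conflicts.append((index, para.last_author, author))
        para.text = new_text
        para.version += 1
        para.last_author = author
        return para.version
```

If two people both edit version 0 of the same paragraph, the second edit
replaces the first and the clash lands in conflicts; deciding what to do
about it is left to the humans.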

But consider it in a more positive light: It's a way to send messages in
context, like sticky paper for collecting feedback. Instead of getting
answers to a question you've forgotten, you get updates to a shared memo. It
may include a wild projection, a table of assignments, a calendar page, or
anything that can be imported into a standard Mac document. It has the
appeal of Post-It notes -- vanilla enough that they can do almost anything you
can think of. When computers are truly ubiquitous, there's sure to be a
copy of Instant Update on every refrigerator.

Release 1.0 Calendar

May 5-8           *Demo '91: The annual personal computer industry product
                  review and demonstration - Palm Springs, CA. Sponsored by
                  P.C. Letter. Call Kim Marker, (415) 592-8880.

May 6-7           Mobile Data conference - Cambridge. Sponsored by Waters
                  Information Services. Call Betsy Martens, (607) 770-8535.

May 6-8           The 1991 Computer services & consultants executive
                  conference - Orlando. Sponsored by IBM. With James
                  Cannavino, George Conrades, Joseph Guglielmi, Ellen Hancock.
                  Call Hal Topper, (404) 238-4228; overseas call Don Avery,
                  1 (416) 443-4606.

May 7-9           *National Online meeting - New York City. Sponsored by
                  Learned Information. Call John Yersak, (609) 654-6266.

May 8             Massachusetts Computer Software Council's spring membership
                  meeting - Boston. Keynote speaker: Steven Jobs. Call Joyce
                  Plotkin, (617) 437-0600.

May 12-13         The thirteenth international conference on software
                  engineering - Austin, TX. Sponsored by ACM, IEEE Computer
                  Society. Call Barbara Smith, (512) 338-3336.

May 14-17         Quality Week 1991: Attaining realistic productivity and
                  quality gains - San Francisco. Sponsored by Software
                  Research. With Dr. Boris Beizer. Call Ed Miller,
                  (415) 957-1441 or (800) 942-SOFT.

May 15            PC user group meeting - New York City. With Jerry Kaplan
                  and Robert Carr, GO. Call John McMullen, (914) 245-2734.

May 19-22         *International Markup '91 - Lugano, Switzerland. Sponsor:
                  Graphic Communications Association. SGML etc. Keynote by
                  Esther Dyson. Call Joy Blake, (703) 519-8160.

May 19-22         IIA spring conference - Palm Springs, CA. Sponsor:
                  Information Industry Ass'n. Call Linda Cunningham,
                  (202) 639-8262.

May 19-23         International DB2 users group: Distributing the experience
                  - San Francisco. Speakers include Chris Date, Michael
                  Stonebraker. Call Larry Fleischman, (312) 644-6610.

May 20-23         Spring Comdex - Atlanta, GA. Sponsored by the Interface
                  Group. Call Elizabeth Moody, (617) 449-6600. Includes
                  Windows World; coincides with Interface/91.

May 21-23         UNIX & Open Systems: Applications, tools & solutions for
                  the '90s - Santa Barbara. Sponsored by Patty Seybold,
                  UniForum and X/Open. With David Stone, DEC; Peter
                  Weinberger, AT&T USL; Ira Goldstein, OSF; Pete Peterson,
                  WordPerfect; Charles House, HP. Call Deborah Hay,
                  (617) 742-5200.

May 21-23         Silicon Graphics developer's forum - San Francisco.
                  Sponsored by Silicon Graphics. Call Debbie Chen,
                  (415) 335-1392.

May 22-23         Investing in venture capital - New York City. Sponsored by
                  the Institute for International Research. Call Tom Judge,
                  (212) 826-1260 or (800) 345-8016.

May 27-31         Avignon '91: Expert systems & their applications - Avignon,
                  France. Sponsored by AFIA, ARC, ECCAI & JSAI. Call Jean-

May 28-30         Database World - Washington, DC. Co-sponsored by
                  [illegible] and Government Computer News. Speakers include
                  Charles Bachman, Robert Epstein, [remainder illegible]

June 3-6          Macworld Expo/Berlin - Berlin, Germany. Sponsored by World
                  Expo Corporation. Call Deborah Paul, (508) 879-6700.

June 3-7          *Object World - San Francisco. Co-sponsored by the Object
                  Management Group and World Expo Corp. Businesspeople's
                  answer to OOPSLA. Call Dave Bradway, (508) 820-8123.

June 4-5          Customer care conference - Chicago. Sponsored by Software
                  Strategies. Speakers include Barbara Brizdle, Richard Brock,
                  Pat Landry. Call John Jacobsen, (203) 335-6090.

June 4-6          *Digital World - Beverly Hills, CA. Sponsored by Seybold
                  Seminars. Digital data meets media and communication
                  industries. Speakers include Steven Jobs, Trip Hawkins,
                  Robert Winter. Call Beth Sadler, (213) 457-5850.

June 9-12         *2nd annual SPA European conference - Cannes, France.
                  Sponsored by SPA. Call Ken Wasch, (202) 452-1600.

June 9-12         Expert Communications '91 - San Francisco. Sponsored by
                  Graphic Communications Association and Davis Review. Call
                  Mills Davis, (202) 667-6400 or Joy Blake, (703) 519-8160.

June 9-16         Poznan international fair - Poznan, Poland. US exhibits
                  sponsored by Department of Commerce, Eastern Europe Business
                  Information Center. Call Bill Vigneault, (202) 377-1793.

June 17-18        Virtual Worlds: Real challenges - Menlo Park, CA. Sponsored
                  by SRI International, The David Sarnoff Research Center and
                  VPL Research. Speakers include Jaron Lanier, VPL Research;
                  Warren Robinett, University of North Carolina; John Thomas,
                  NYNEX Corporation. Call Teresa Middleton, (415) 859-3382.

June 17-18        Technical product development through strategic customer
                  support - San Francisco. Sponsored by the Institute for
                  Int'l Research. Call Kathleen Erb, (212) 826-1260 or
                  (800) 345-8016.

June 17-21        *International Computer Forum - Moscow. Sponsored by the
                  International Computer Club. Call Levon Amdilyan,
                  7 (095) 921-09-02, or "levon" on MCI Mail at 439-1034; or
                  Esther Dyson at 1 (212) 758-3434.

June 18-21        Videotex 91: Broadening the consumer market - Crystal City,
                  VA. Sponsored by Videotex Industry Association. Call Debbie
                  Tritle, (301) 495-4955.

June 19-21        Supercomputing USA/Pacific 91 - Santa Clara. Sponsored by
                  Meridian Pacific Group. Call Gerard Parker, (415) 381-2255.

June 24-27        SCOOP East '91 - East Rutherford, NJ. Sponsored by the Wang
                  Institute of Boston University and the Journal of Object
                  Oriented Programming. Call Bob Daniels, (508) 649-9731.

                  First international Windows 3.0 developers conference -
                  Santa Clara. Sponsored by The Wang Institute of Boston
                  University. Keynote speakers include Bob Muglia, Microsoft;
                  Eugene Wang, Borland International. Call Andree Fontaine,
                  (508) 649-9731.

                  PC Expo - New York City. Sponsor: Blenheim. With Ray
                  Noorda. Call Annie Scully, (201) 569-8542 or
                  (800) 444-EXPO.

                  [name illegible] '91 - London, UK. Sponsored by Blenheim
                  [illegible]. Call Lynne Davey, 44 (81) 8*8-4466.

                  *Machine Translation Summit III - Washington, DC. Sponsored
                  by the Center for Machine Translation, Carnegie Mellon
                  University. Call Jaime Carbonell, [phone illegible]; e-mail
                  mtsummit@cs.cmu.edu.

July 9-14         *PC Forum - Moscow. Organized by IDG World Expo and
                  Information Computer Enterprise, USSR; co-sponsored by
                  several USSR state committees. Call Terence Coe,
                  (508) 879-6700.

July 14-19        *AAAI conference - Anaheim. Sponsored by American
                  Association for Artificial Intelligence. Also includes
                  Innovative Applications of AI. Call Carol Hamilton,
                  (415) 328-3123.

July 15-18        Network computing conference and exposition - Washington,
                  DC. Sponsored by IDG World Expo Corporation. Call Brenda
                  Cone, (800) 225-4698 or (508) 879-6700.

July 15-18        Communication networks - San Francisco. Sponsored by World
                  Expo. Keynotes: Mark Baker, British Telecom; Eric Schmidt,
                  Sun Tech; Ambassador Bradley Holmes. Call Debra Anderson,
                  (617) 769-8950 or (800) 225-4698.

July 17           *Software Entrepreneurs' Forum - Palo Alto, CA. Dinner talk
                  by Esther Dyson. Call Barbara Cass, (415) 857-1110.

July 23-26        Artificial intelligence and the help desk - San Francisco.
                  Sponsored by the Help Desk Institute. Call Elaine
                  Worthington, (719) 531-5138.

July 28-Aug 2     *SIGGRAPH '91 - Las Vegas. Sponsored by ACM. Art meets
                  computers: The place to see and be seen. Call Jackie
                  Groszek, (312) 644-6610.

July 29-Aug 1     Tools U.S.A. '91 - Santa Barbara. Sponsored by Interactive
                  Software Engineering. Call Bertrand Meyer, (805) 685-1006.

August 5-8        International workshop on human-computer interaction -
                  Moscow. Sponsored by California State University and the
                  International Centre for Scientific and Technical
                  Information, Moscow. Contacts: Larry Press, (213) 475-6515,
                  fax (213) 516-3664, e-mail lpress@venera.isi.edu; or Yuri
                  Gornostaev, 7 (095) 198-72-41 or enir@iaeal.bitnet.

August 6-9        Macworld Expo - Boston. Sponsored by World Expo
                  Corporation. Call Deborah Paul, (508) 879-6700.

August 11-13      *GeoCon/91 - Cambridge, MA. Sponsored by Soft•letter. An
                  international product showcase for European, Canadian, Asian
                  and Latin American developers who seek U.S. publishing or
                  partnership contacts. Call Jeff Tarter, (617) 924-3944.

August 14-16      Windows & OS/2 - Boston. Co-sponsored by PC Week and CM
                  Ventures. Call John Bourgein, (415) 601-5000.

September 4-6     UNIX Open Solutions - San Jose. Sponsor: Interface Group.
                  Call Elizabeth Meagher, (617) 449-6600 or (800) 325-8850.

September 11-13   Breakaway 1991 - Atlantic City, NJ. Sponsored by ABCD.
                  Resellers and vendors trade tips and "frank discussion."
                  Call Debbie Keating, (601) 977-9033.

September 11-14   Software Publishers Association annual conference -
                  Orlando. Sponsored by SPA. Call Ken Wasch, (202) 452-1600.

September 12-14   *ETRE - Opic, France. Sponsored by Dasar. Call Alex Vieux,
                  (415) 321-5544.

September 20-21   Sources 1991: Asian financing & alliances - Santa Clara.
                  Sponsored by Asian American Manufacturers Association. Call
                  George Koo, (415) 321-AAMA.

September 22-24   *Agenda 92 - Laguna Niguel, CA. Sponsored by P.C.
                  Letter/PCW Communications. Call Tracy Beiers,
                  (415) 592-8880.

September 25-27   *Second European conference on computer-supported
                  cooperative work - Amsterdam. Knowledge workers and
                  academics, unite! Organized by the Center for Innovation
                  and Cooperative Technology of the University of Amsterdam.
                  (The language of
                  cooperation is English.) Call Mike Robinson or Liam Bannon,
                  31 (20) 525 1250/1225; fax, 31 (20) 5251211; e-mail,
                  Bannon@leam.ucd.ie; or Charlie Grantham, 1 (415) 370-174;
                  cegrant@well.sf.ca.us.

Sept 30-Oct 1     Virtual reality conference - San Francisco. Sponsored by
                  the Meckler Corporation. Call Marilyn Reed, (203) 226-6967
                  or (800) 635-5537.

Sept 30-Oct 4     *Seybold Conference - San Jose. The leading event in the
                  computer publishing community. Sponsored by Seybold
                  Seminars/Ziff. Call Kevin Howard or Beth Sadler,
                  (213) 457-5850.

October 1-3       INFO '91 - New York City. Sponsored by Cahners Exposition
                  Group. Call Marilyn Harrington, (203) 352-8477.

October 1-4       Seybold computer publishing conference & exposition - San
                  Jose. Sponsored by Seybold Seminars. The evolving process
                  of communication. Call Beth Sadler, (213) 457-5850.

October 6-11      *OOPSLA '91 - Phoenix. Sponsored by ACM. Call John
                  Richards, (914) 784-7731.

October 7-11      Interop '91 - San Jose. Sponsored by Advanced Computing
                  Environments/Ziff. With Ellen Hancock, IBM Communication
                  Systems. Call Dan Lynch, (415) 941-3399.

October 14-18     CD-ROM Expo - Washington, DC. Sponsored by World Expo
                  Corporation. Call Terry Merrell, (508) 879-6700.

October 15-17     NetWorld '91 - Dallas. Sponsored by Bruno Blenheim. Call
                  Annie Scully, (201) 569-8542 or (800) 444-EXPO.

October 17-21     USA Showcase '91 - Budapest. Co-sponsored by the Hungarian
                  Ministry of Trade, the Hungarian Chamber of Commerce and the
                  American Chamber of Commerce in Budapest. Call Jay Bowman
                  at (713) 266-0610.

October 21-23     Twelfth annual Alex. Brown technology seminar - Baltimore.
                  Primarily for investors. Call Lori Bresnick,
                  (301) 727-1700.

October 21-25     *Comdex - Las Vegas. So wonderful they couldn't wait until
                  November? Whatever the reason.... Sponsored by the
                  Interface Group. Call Elizabeth Moody, (617) 449-6600.

October 27-29     The Classic - Monterey, CA. Sponsored by the American
                  Electronics Association, for cute companies and eager
                  investors. Call Flo Lewis, (408) 987-4200.

Oct 30-Nov 1      UNIX Expo - New York City. Sponsor: Blenheim Expositions.
                  Keynote by Steve Jobs. Call Pan O'Neill, (512) 343-1111.

November 4-7      ADAPSO fall management conference - San Francisco.
                  Sponsored by ADAPSO. Call Shirley Price, (703) 284-5355.

November 10-13    **Second East-West High-Tech Forum - Warsaw (Prague in
                  1992). Sponsored by EDventure Holdings. With a roster of
                  serious-minded entrepreneurs and vendors from East and
                  West. Don't just come to listen to advice: come to mingle
                  with the people making it happen. Call Daphne Kis,
                  1 (212) 758-3434 or fax (212) 832-1720; MCI Mail:
                  EDventure, 443-1400.

November 12-14    Unicom '91 - Washington, DC. Sponsored by North American
                  Telecommunications Ass'n. Call Susan Ryba, (202) 296-9800.

November 13-15    *[name illegible] '91 - Budapest. Sponsored by the
                  Hungarian Telecommunications Scientific Society. Call
                  [name illegible], (703) 527-5000.

November 17-20    IIA annual convention - Orlando. Sponsor: Information
                  Industry Ass'n. Call Linda Cunningham, (202) 639-8262.

November 19-21    PC Expo - Chicago. Sponsored by Bruno Blenheim. Call Steve
                  Fefaer, (201) 569-8542 or (800) 444-EXPO.

December 2-4      *Alliance 91 - Tokyo, Japan. Sponsored by Harvard Business
                  School Ass'n. Strategic alliances with Japanese companies.
                  Call Mark Francis or Yasuhito Mikamo, (415) 742-0757.

December 3-5      European Publishing conference - The Hague, Holland.
                  Sponsored by Seybold Seminars. Contact: Laurel Brunner,
                  44 (323) 410561 or fax, 44 (323) 410279.

December 15-18    *Hypertext '91 - San Antonio, TX. Third international
                  conference on hypertext. Sponsored by ACM. Call Janet
                  Walker, (409) 845-0298, e-mail leggett@bush.tamu.edu.

Please let us know about any other events we should include. -- Denise DuBois

*The asterisks indicate events we plan to attend. Lack of an asterisk is no
indication of lack of merit.